fix: add shlex fallback parser and heredoc warning by cabana8471-arch · Pull Request #127 · AutoForgeAI/autoforge

cabana8471-arch · 2026-01-29T07:04:26Z

Summary

Two focused bug fixes discovered from production log analysis:

shlex fallback parser - Complex docker commands with nested quotes (e.g., docker exec container php -r "echo \"test\";") caused shlex.split() to fail, which previously returned an empty list and blocked the entire command. Now extracts the primary command via fallback parsing.
heredoc warning - Sandbox mode blocks /tmp access, causing heredoc-style git commits to fail with "can't create temp file for here document". Added clear guidance to use multiple -m flags instead.

Changes

File	Change
`security.py`	Add `_extract_primary_command()` fallback when shlex fails
`.claude/templates/coding_prompt.template.md`	Add heredoc warning with alternative patterns

Test plan

extract_commands('docker exec container php -r "echo \"test\";"') returns ['docker'] (was [])
All 162 security unit tests pass
ruff check security.py passes

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Updated development commit message guidelines with comprehensive rules, examples, and formatting standards for multi-line messages.
Bug Fixes
- Improved command extraction resilience with fallback mechanisms to gracefully handle malformed or complex segments.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

- Add _extract_primary_command() fallback when shlex.split() fails on complex nested quotes (e.g., docker exec with PHP) - Returns primary command instead of empty list, allowing valid commands to proceed - Add heredoc warning to coding prompt - sandbox blocks /tmp access for here documents - All 162 security tests pass

coderabbitai · 2026-01-29T07:04:46Z

📝 Walkthrough

Walkthrough

Two focused updates improve commit message guidance and command extraction robustness. The template file now includes detailed git commit rules with examples, while the security module gains a fallback helper function to handle malformed command segments gracefully instead of failing.

Changes

Cohort / File(s)	Summary
Documentation & Guidelines `.claude/templates/coding_prompt.template.md`	Added comprehensive "Git Commit Rules" section with specific guidance on flag usage, heredoc limitations, and multi-line message formatting. Expanded examples demonstrate both multi-line and single-message commit workflows.
Command Extraction Resilience `security.py`	Added new `_extract_primary_command()` helper function to heuristically extract primary commands from malformed segments. Modified `extract_commands()` to use this fallback when `shlex.split()` fails, improving robustness without returning empty results.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Git commits now shine with clarity,
Rules and examples, no disparity!
Commands stumble? No need to fear—
Fallback extraction keeps paths clear!
hops merrily 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects both main changes: adding a shlex fallback parser (in security.py) and a heredoc warning (in the coding prompt template).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@security.py`:
- Around line 211-217: In the except ValueError block that currently calls
_extract_primary_command(segment) and appends fallback_cmd to commands, add an
explicit security-level log entry (including timestamp, the original segment,
the fallback_cmd, and context like caller or request id if available) whenever
fallback extraction is used; ensure the logger used is the module/application
security/audit logger and that the message clearly flags this as a fallback
extraction for malformed input so these events are auditable and easy to search
later.

🧹 Nitpick comments (1)

security.py (1)

198-198: Minor note: consider removing the comment.

The comment # (re is already imported at module level) is helpful during review but may become stale or add noise over time. Consider removing it since the import at line 10 is self-evident.

coderabbitai · 2026-01-29T07:09:48Z

security.py

        except ValueError:
            # Malformed command (unclosed quotes, etc.)
-            # Return empty to trigger block (fail-safe)
-            return []
+            # Try fallback extraction instead of blocking entirely
+            fallback_cmd = _extract_primary_command(segment)
+            if fallback_cmd:
+                commands.append(fallback_cmd)
+            continue


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate the security.py file and understand the structure find . -name "security.py" -type f | head -5

Repository: leonvanzyl/autocoder

Length of output: 77

🏁 Script executed:

# Find test files related to security find . -name "*test*.py" -o -name "test_*" -type f | grep -i security | head -10

Repository: leonvanzyl/autocoder

Length of output: 113

🏁 Script executed:

# Search for the _extract_primary_command function and related code rg -n "_extract_primary_command|extract_commands" --type=py | head -20

Repository: leonvanzyl/autocoder

Length of output: 637

🏁 Script executed:

# Search for the bash_security_hook and understand the fail-safe mechanism rg -n "bash_security_hook|ValueError.*shlex" --type=py -A 3 -B 3

Repository: leonvanzyl/autocoder

Length of output: 11785

🏁 Script executed:

# Read the security.py file to understand the fallback mechanism and the surrounding context cat -n security.py | sed -n '143,220p'

Repository: leonvanzyl/autocoder

Length of output: 2848

🏁 Script executed:

# Look at the blocklist and how it's enforced rg -n "HARDCODED_BLOCKLIST|BLOCKLIST|sudo|dd|shutdown" security.py -A 2 -B 2 | head -50

Repository: leonvanzyl/autocoder

Length of output: 1513

🏁 Script executed:

# Examine the test_extract_commands function to see what's tested cat -n test_security.py | sed -n '97,150p'

Repository: leonvanzyl/autocoder

Length of output: 2292

🏁 Script executed:

# Search for tests with malformed/unclosed commands rg -n "unclosed|malformed|shlex|ValueError" test_security.py -A 3 -B 2

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Check if there are any tests with chained commands and malformed input rg -n "&&|;|pipe.*unclosed|malformed.*&&" test_security.py -A 5 -B 5

Repository: leonvanzyl/autocoder

Length of output: 2062

🏁 Script executed:

# See how the blocklist is enforced in bash_security_hook cat -n security.py | sed -n '787,850p'

Repository: leonvanzyl/autocoder

Length of output: 3020

🏁 Script executed:

# Look at the full validate_command function to understand blocklist enforcement rg -n "def validate_command|BLOCKED_COMMANDS|DANGEROUS_COMMANDS" security.py -A 5 | head -100

Repository: leonvanzyl/autocoder

Length of output: 1453

🏁 Script executed:

# Check if there are any tests with malformed input (unclosed quotes) rg -n "unclosed|malformed|shlex.*fail|ValueError" test_security.py

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Look for tests that specifically test the fallback mechanism rg -n "fallback|_extract_primary_command" test_security.py

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Search entire codebase for any tests with malformed commands containing chained operations rg -n '".*&&.*".*unclosed|unclosed.*&&' test_security.py test_security_integration.py

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Let me check if there are any tests with malformed commands in general rg -n 'unclosed|malformed|".*\\$' test_security.py test_security_integration.py -A 3 -B 1

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Check if the fallback extraction is tested with any dangerous commands rg -n 'sudo|dd|shutdown' test_security.py -A 2 -B 2

Repository: leonvanzyl/autocoder

Length of output: 1638

🏁 Script executed:

# Let's trace through a specific test case with chained commands to understand the extraction cat -n test_security.py | sed -n '1075,1130p'

Repository: leonvanzyl/autocoder

Length of output: 660

🏁 Script executed:

# Check if there are any edge case tests with malformed input rg -n "test.*malformed|test.*unclosed|test.*fallback" test_security.py -A 10

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Let me examine the complete extract_commands function to see all code paths cat -n security.py | sed -n '182,280p'

Repository: leonvanzyl/autocoder

Length of output: 3457

🏁 Script executed:

# Let me check what the code did BEFORE the fallback was introduced # by looking at git history to see the original behavior git log --oneline -20 -- security.py

Repository: leonvanzyl/autocoder

Length of output: 122

🏁 Script executed:

# Check if there are any tests that specifically test the behavior when extract_commands returns empty rg -n "extract_commands.*\[\]|empty.*commands|extract.*fail" test_security.py -B 3 -A 3

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Let's create a test case to verify the concern - malformed input with chained dangerous commands python3 << 'EOF' import sys sys.path.insert(0, '.') from security import extract_commands, _extract_primary_command # Test case 1: Malformed segment with chained dangerous command test_cases = [ # Normal cases ('docker exec php -r "echo test"', "normal docker with unclosed quote simulation"), # Malformed with chained command ('valid_cmd "unclosed && sudo rm -rf /', "chained dangerous after unclosed quote"), ('ls "unclosed && dd if=/dev/zero', "dd chained after unclosed quote"), ('echo "test && shutdown -h now', "shutdown chained after unclosed quote"), ] print("Testing extract_commands with malformed + chained dangerous commands:\n") for cmd, description in test_cases: result = extract_commands(cmd) print(f"Input: {cmd!r}") print(f"Description: {description}") print(f"Extracted commands: {result}") print() EOF

Repository: leonvanzyl/autocoder

Length of output: 631

🏁 Script executed:

# Check if the fallback was recently added or if it was always there git log -p -- security.py | grep -A 10 -B 10 "_extract_primary_command" | head -50

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# See when the fallback logic was introduced git log --oneline --all | head -5

Repository: leonvanzyl/autocoder

Length of output: 402

🏁 Script executed:

# Check what happens when extract_commands returns empty list in bash_security_hook cat -n security.py | sed -n '808,820p'

Repository: leonvanzyl/autocoder

Length of output: 574

🏁 Script executed:

# Let me test what happens with a pure malformed command (no valid first command) python3 << 'EOF' import sys sys.path.insert(0, '.') from security import extract_commands # Test cases where malformed has NO valid command test_cases = [ ('&&& malformed', "operators only"), ('"unclosed only', "just unclosed quote"), ('; ; ;', "only operators"), ('|||', "only pipes"), ] print("Testing extract_commands with ONLY malformed syntax:\n") for cmd, description in test_cases: result = extract_commands(cmd) print(f"Input: {cmd!r}") print(f"Description: {description}") print(f"Extracted commands: {result}") print(f"Length: {len(result)}") print() EOF

Repository: leonvanzyl/autocoder

Length of output: 451

🏁 Script executed:

# Check the commit that introduced the fallback to understand the original behavior git show d47028d --stat

Repository: leonvanzyl/autocoder

Length of output: 11619

🏁 Script executed:

# Let's see the actual changes made in that commit git show d47028d

Repository: leonvanzyl/autocoder

Length of output: 50380

🏁 Script executed:

# Check what tests exist for malformed input and fallback rg -n "test.*fallback|test.*malformed|test.*unclosed|test.*shlex" test_security.py -A 5 -B 2

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Verify that all BLOCKED_COMMANDS and DANGEROUS_COMMANDS are in the blocklist enforcement cat -n security.py | sed -n '72,115p'

Repository: leonvanzyl/autocoder

Length of output: 1268

🏁 Script executed:

# Verify the blocklist is actually enforced in bash_security_hook cat -n security.py | sed -n '834,860p'

Repository: leonvanzyl/autocoder

Length of output: 1541

🏁 Script executed:

# Test if a dangerous command can bypass when malformed python3 << 'EOF' import sys sys.path.insert(0, '.') from security import extract_commands, get_effective_commands # Test if sudo/dd can bypass in malformed segments malformed_tests = [ 'valid_cmd "unclosed && sudo apt install', 'echo test; sudo rm -rf /', 'ls "unclosed; dd if=/dev/zero', ] dangerous_set = {"sudo", "dd", "shutdown", "reboot"} print("Testing if dangerous commands extracted from malformed segments:\n") for cmd in malformed_tests: extracted = extract_commands(cmd) has_dangerous = any(c in dangerous_set for c in extracted) print(f"Input: {cmd!r}") print(f"Extracted: {extracted}") print(f"Contains dangerous: {has_dangerous}") print() # Check the blocklist allowed, blocked = get_effective_commands(None) print("Dangerous commands in blocklist:") for cmd in ["sudo", "dd", "shutdown", "reboot", "su"]: in_blocked = cmd in blocked print(f" {cmd}: {in_blocked}") EOF

Repository: leonvanzyl/autocoder

Length of output: 507

🏁 Script executed:

# Let me verify the actual security concern more thoroughly # Test the specific example from the review comment python3 << 'EOF' import sys sys.path.insert(0, '.') import asyncio from security import bash_security_hook test_cases = [ # The example from the review comment 'docker exec php -r "unclosed && sudo rm -rf /', # Variations 'valid_cmd "unclosed && sudo apt install package', 'echo test; sudo rm -rf /', # This should work (extracted) 'ls "unclosed && shutdown -h now', ] async def test_command(cmd): input_data = {"tool_name": "Bash", "tool_input": {"command": cmd}} result = await bash_security_hook(input_data) decision = result.get("decision", "allow") reason = result.get("reason", "allowed by default") return decision, reason print("Testing bash_security_hook with malformed + dangerous commands:\n") for cmd in test_cases: decision, reason = asyncio.run(test_command(cmd)) print(f"Command: {cmd!r}") print(f"Decision: {decision}") if decision == "block": print(f"Reason: {reason[:100]}...") print() EOF

Repository: leonvanzyl/autocoder

Length of output: 586

🏁 Script executed:

# Check if there are any tests currently for the fallback behavior grep -r "fallback\|_extract_primary_command" test_security.py test_security_integration.py

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Look for any TODO or FIXME comments about the fallback rg -n "TODO|FIXME|XXX" security.py | head -20

Repository: leonvanzyl/autocoder

Length of output: 46

🏁 Script executed:

# Let me trace through the logic more carefully for the docker case python3 << 'EOF' import sys sys.path.insert(0, '.') from security import extract_commands, is_command_allowed, get_effective_commands # The critical case cmd = 'docker exec php -r "unclosed && sudo rm -rf /' print(f"Testing: {cmd!r}\n") # Step 1: What gets extracted? extracted = extract_commands(cmd) print(f"Step 1 - Extracted commands: {extracted}") # Step 2: Is docker allowed? allowed, blocked = get_effective_commands(None) docker_allowed = is_command_allowed("docker", allowed) print(f"Step 2 - Is 'docker' allowed? {docker_allowed}") print(f" Docker in allowed: {'docker' in allowed}") print(f" Docker in blocked: {'docker' in blocked}") # Step 3: Was sudo extracted? sudo_extracted = "sudo" in extracted print(f"Step 3 - Was 'sudo' extracted? {sudo_extracted}") print(f" Sudo in blocked: {'sudo' in blocked}") print(f"\nResult: Command ALLOWED because docker is allowed and sudo was NOT extracted") print(f"Risk: sudo is present in the command string but never validated") EOF

Repository: leonvanzyl/autocoder

Length of output: 473

🏁 Script executed:

# Check what's in the default allowlist for docker rg -n "docker" security.py | head -20

Repository: leonvanzyl/autocoder

Length of output: 168

🏁 Script executed:

# Verify the default allowed commands cat -n security.py | sed -n '35,70p'

Repository: leonvanzyl/autocoder

Length of output: 1116

Add logging when fallback command extraction is used to ensure security-critical misses are auditable.

The fallback extraction for malformed segments (e.g., docker exec php -r "unclosed && sudo rm -rf /) can allow dangerous commands to bypass validation. In the example above:

docker is extracted and allowed

sudo inside the malformed quote is never extracted or validated

The command passes the security check despite containing a blocked command

While such malformed input typically fails shell parsing as well (defense-in-depth), this creates an unauditable security blind spot. The original fail-safe behavior blocked unparseable commands entirely; the new behavior extracts only the primary command and allows it if safe.

Consider adding explicit logging (with timestamp and command context) when fallback extraction is triggered, so security-sensitive operations are auditable. Alternatively, document this tradeoff as an accepted risk in the project's security model.

🤖 Prompt for AI Agents

In `@security.py` around lines 211 - 217, In the except ValueError block that currently calls _extract_primary_command(segment) and appends fallback_cmd to commands, add an explicit security-level log entry (including timestamp, the original segment, the fallback_cmd, and context like caller or request id if available) whenever fallback extraction is used; ensure the logger used is the module/application security/audit logger and that the message clearly flags this as a fallback extraction for malformed input so these events are auditable and easy to search later.

leonvanzyl · 2026-01-29T07:17:40Z

Thanks

- Add debug logging when shlex fallback extraction is used, capturing both successful extractions and failures for security auditing - Add test case for docker nested quotes that trigger fallback parser - Remove redundant comment about re import (already at module level) Follow-up improvements from PR #127 code review: - Enables tracking of malformed command patterns in production logs - Verifies fallback parser handles the exact docker exec case reported Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…ture AutoForgeAI#127) - 63 tests verifying all 9 feature steps for FeatureCompiler.compile() - Validates task_type mapping, tool_policy, budgets, acceptance_spec, traceability - Confirms 5 distinct task_types across multi-feature DB run (coding, testing, audit, documentation, refactoring) - Zero regressions in existing 46-test E2E suite Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…shlex-fallback fix: add shlex fallback parser and heredoc warning

- Add debug logging when shlex fallback extraction is used, capturing both successful extractions and failures for security auditing - Add test case for docker nested quotes that trigger fallback parser - Remove redundant comment about re import (already at module level) Follow-up improvements from PR AutoForgeAI#127 code review: - Enables tracking of malformed command patterns in production logs - Verifies fallback parser handles the exact docker exec case reported Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai bot reviewed Jan 29, 2026

View reviewed changes

leonvanzyl merged commit 77b91ca into AutoForgeAI:master Jan 29, 2026
3 checks passed

cabana8471-arch deleted the fix/security-shlex-fallback branch January 29, 2026 16:35

CoreAspectStu pushed a commit to CoreAspectStu/autocoder-custom that referenced this pull request Feb 9, 2026

Merge pull request AutoForgeAI#127 from cabana8471-arch/fix/security-…

f7a348d

…shlex-fallback fix: add shlex fallback parser and heredoc warning

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

fix: add shlex fallback parser and heredoc warning#127

fix: add shlex fallback parser and heredoc warning#127
leonvanzyl merged 1 commit intoAutoForgeAI:masterfrom
cabana8471-arch:fix/security-shlex-fallback

cabana8471-arch commented Jan 29, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 29, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 29, 2026

Uh oh!

Uh oh!

leonvanzyl commented Jan 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

cabana8471-arch commented Jan 29, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

leonvanzyl commented Jan 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cabana8471-arch commented Jan 29, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 29, 2026 •

edited

Loading